target structure
Multi-state Protein Design with DynamicMPNN
Abrudan, Alex, Ojeda, Sebastian Pujalte, Joshi, Chaitanya K., Greenig, Matthew, Engelberger, Felipe, Khmelinskaia, Alena, Meiler, Jens, Vendruscolo, Michele, Knowles, Tuomas P. J.
Structural biology has long been dominated by the one sequence, one structure, one function paradigm, yet many critical biological processes--from enzyme catalysis to membrane transport--depend on proteins that adopt multiple conformational states. Existing multi-state design approaches rely on post-hoc aggregation of single-state predictions, achieving poor experimental success rates compared to single-state design. We introduce DynamicMPNN, an inverse folding model explicitly trained to generate sequences compatible with multiple conformations through joint learning across conformational ensembles. Trained on 46,033 conformational pairs covering 75% of CA TH superfamilies and evaluated using Alphafold 3, DynamicMPNN outperforms ProteinMPNN by up to 25% on decoy-normalized RMSD and by 12% on sequence recovery across our challenging multi-state protein benchmark. A commonly derived assumption from Anfinsen's experiment is that proteins adopt only one native 3D structure, leading to the "one sequence, one structure, one function" canon. This view has been indirectly reinforced by the predominant use of X-Ray crystallography in experimental protein structure determination, which requires that proteins form a homogenous, diffractible crystal to be characterised (Dishman & V olkman, 2018).
BuilderBench -- A benchmark for generalist agents
Ghugare, Raj, Ji, Catherine, Wantlin, Kathryn, Schofield, Jin, Eysenbach, Benjamin
Today's AI models learn primarily through mimicry and sharpening, so it is not surprising that they struggle to solve problems beyond the limits set by existing data. To solve novel problems, agents should acquire skills for exploring and learning through experience. Finding a scalable learning mechanism for developing agents that learn through interaction remains a major open problem. In this work, we introduce BuilderBench, a benchmark to accelerate research into agent pre-training that centers open-ended exploration. BuilderBench requires agents to learn how to build any structure using blocks. BuilderBench is equipped with $(1)$ a hardware accelerated simulator of a robotic agent interacting with various physical blocks, and $(2)$ a task-suite with over 42 diverse target structures that are carefully curated to test an understanding of physics, mathematics, and long-horizon planning. During training, agents have to explore and learn general principles about the environment without any external supervision. During evaluation, agents have to build the unseen target structures from the task suite. Solving these tasks requires a sort of \emph{embodied reasoning} that is not reflected in words but rather in actions, experimenting with different strategies and piecing them together. Our experiments show that many of these tasks challenge the current iteration of algorithms. Hence, we also provide a ``training wheels'' protocol, in which agents are trained and evaluated to build a single target structure from the task suite. Finally, we provide single-file implementations of six different algorithms as a reference point for researchers.
IDfRA: Self-Verification for Iterative Design in Robotic Assembly
Khendry, Nishka, Margadji, Christos, Pattinson, Sebastian W.
As robots proliferate in manufacturing, Design for Robotic Assembly (DfRA), which is designing products for efficient automated assembly, is increasingly important. Traditional approaches to DfRA rely on manual planning, which is time-consuming, expensive and potentially impractical for complex objects. Large language models (LLM) have exhibited proficiency in semantic interpretation and robotic task planning, stimulating interest in their application to the automation of DfRA. But existing methodologies typically rely on heuristic strategies and rigid, hard-coded physics simulators that may not translate into real-world assembly contexts. In this work, we present Iterative Design for Robotic Assembly (IDfRA), a framework using iterative cycles of planning, execution, verification, and re-planning, each informed by self-assessment, to progressively enhance design quality within a fixed yet initially under-specified environment, thereby eliminating the physics simulation with the real world itself. The framework accepts as input a target structure together with a partial environmental representation. Through successive refinement, it converges toward solutions that reconcile semantic fidelity with physical feasibility. Empirical evaluation demonstrates that IDfRA attains 73.3\% top-1 accuracy in semantic recognisability, surpassing the baseline on this metric. Moreover, the resulting assembly plans exhibit robust physical feasibility, achieving an overall 86.9\% construction success rate, with design quality improving across iterations, albeit not always monotonically. Pairwise human evaluation further corroborates the advantages of IDfRA relative to alternative approaches. By integrating self-verification with context-aware adaptation, the framework evidences strong potential for deployment in unstructured manufacturing scenarios.
BAP v2: An Enhanced Task Framework for Instruction Following in Minecraft Dialogues
Jayannavar, Prashant, Ren, Liliang, Hudspeth, Marisa, Lambert, Charlotte, Cordes, Ariel, Kaplan, Elizabeth, Narayan-Chen, Anjali, Hockenmaier, Julia
Interactive agents capable of understanding and executing instructions in the physical world have long been a central goal in AI research. The Minecraft Collaborative Building Task (MCBT) provides one such setting to work towards this goal (Narayan-Chen, Jayannavar, and Hockenmaier 2019). It is a two-player game in which an Architect (A) instructs a Builder (B) to construct a target structure in a simulated Blocks World Environment. We focus on the challenging Builder Action Prediction (BAP) subtask of predicting correct action sequences in a given multimodal game context with limited training data (Jayannavar, Narayan-Chen, and Hockenmaier 2020). We take a closer look at evaluation and data for the BAP task, discovering key challenges and making significant improvements on both fronts to propose BAP v2, an upgraded version of the task. This will allow future work to make more efficient and meaningful progress on it. It comprises of: (1) an enhanced evaluation benchmark that includes a cleaner test set and fairer, more insightful metrics, and (2) additional synthetic training data generated from novel Minecraft dialogue and target structure simulators emulating the MCBT. We show that the synthetic data can be used to train more performant and robust neural models even with relatively simple training methods. Looking ahead, such data could also be crucial for training more sophisticated, data-hungry deep transformer models and training/fine-tuning increasingly large LLMs. Although modeling is not the primary focus of this work, we also illustrate the impact of our data and training methodologies on a simple LLM- and transformer-based model, thus validating the robustness of our approach, and setting the stage for more advanced architectures and LLMs going forward.
Hybrid-Neuromorphic Approach for Underwater Robotics Applications: A Conceptual Framework
Sudevan, Vidya, Zayer, Fakhreddine, Javed, Sajid, Karki, Hamad, De Masi, Giulia, Dias, Jorge
This paper introduces the concept of employing neuromorphic methodologies for task-oriented underwater robotics applications. In contrast to the increasing computational demands of conventional deep learning algorithms, neuromorphic technology, leveraging spiking neural network architectures, promises sophisticated artificial intelligence with significantly reduced computational requirements and power consumption, emulating human brain operational principles. Despite documented neuromorphic technology applications in various robotic domains, its utilization in marine robotics remains largely unexplored. Thus, this article proposes a unified framework for integrating neuromorphic technologies for perception, pose estimation, and haptic-guided conditional control of underwater vehicles, customized to specific user-defined objectives. This conceptual framework stands to revolutionize underwater robotics, enhancing efficiency and autonomy while reducing energy consumption. By enabling greater adaptability and robustness, this advancement could facilitate applications such as underwater exploration, environmental monitoring, and infrastructure maintenance, thereby contributing to significant progress in marine science and technology.
Towards No-Code Programming of Cobots: Experiments with Code Synthesis by Large Code Models for Conversational Programming
Kranti, Chalamalasetti, Hakimov, Sherzod, Schlangen, David
While there has been a lot of research recently on robots in household environments, at the present time, most robots in existence can be found on shop floors, and most interactions between humans and robots happen there. ``Collaborative robots'' (cobots) designed to work alongside humans on assembly lines traditionally require expert programming, limiting ability to make changes, or manual guidance, limiting expressivity of the resulting programs. To address these limitations, we explore using Large Language Models (LLMs), and in particular, their abilities of doing in-context learning, for conversational code generation. As a first step, we define RATS, the ``Repetitive Assembly Task'', a 2D building task designed to lay the foundation for simulating industry assembly scenarios. In this task, a `programmer' instructs a cobot, using natural language, on how a certain assembly is to be built; that is, the programmer induces a program, through natural language. We create a dataset that pairs target structures with various example instructions (human-authored, template-based, and model-generated) and example code. With this, we systematically evaluate the capabilities of state-of-the-art LLMs for synthesising this kind of code, given in-context examples. Evaluating in a simulated environment, we find that LLMs are capable of generating accurate `first order code' (instruction sequences), but have problems producing `higher-order code' (abstractions such as functions, or use of loops).
IDAT: A Multi-Modal Dataset and Toolkit for Building and Evaluating Interactive Task-Solving Agents
Mohanty, Shrestha, Arabzadeh, Negar, Tupini, Andrea, Sun, Yuxuan, Skrynnik, Alexey, Zholus, Artem, Côté, Marc-Alexandre, Kiseleva, Julia
Seamless interaction between AI agents and humans using natural language remains a key goal in AI research. This paper addresses the challenges of developing interactive agents capable of understanding and executing grounded natural language instructions through the IGLU competition at NeurIPS. Despite advancements, challenges such as a scarcity of appropriate datasets and the need for effective evaluation platforms persist. We introduce a scalable data collection tool for gathering interactive grounded language instructions within a Minecraft-like environment, resulting in a Multi-Modal dataset with around 9,000 utterances and over 1,000 clarification questions. Additionally, we present a Human-in-the-Loop interactive evaluation platform for qualitative analysis and comparison of agent performance through multi-turn communication with human annotators. We offer to the community these assets referred to as IDAT (IGLU Dataset And Toolkit) which aim to advance the development of intelligent, interactive AI agents and provide essential resources for further research.
Solving Inverse Problems in Protein Space Using Diffusion-Based Priors
Levy, Axel, Chan, Eric R., Fridovich-Keil, Sara, Poitevin, Frédéric, Zhong, Ellen D., Wetzstein, Gordon
The interaction of a protein with its environment can be understood and controlled via its 3D structure. Experimental methods for protein structure determination, such as X-ray crystallography or cryogenic electron microscopy, shed light on biological processes but introduce challenging inverse problems. Learning-based approaches have emerged as accurate and efficient methods to solve these inverse problems for 3D structure determination, but are specialized for a predefined type of measurement. Here, we introduce a versatile framework to turn raw biophysical measurements of varying types into 3D atomic models. Our method combines a physics-based forward model of the measurement process with a pretrained generative model providing a task-agnostic, data-driven prior. Our method outperforms posterior sampling baselines on both linear and non-linear inverse problems. In particular, it is the first diffusion-based method for refining atomic models from cryo-EM density maps.
A Novel Optimization-Based Collision Avoidance For Autonomous On-Orbit Assembly
Tavana, Siavash, Faghihi, Sepideh, de Ruiter, Anton, Kumar, Krishna Dev
The collision avoidance constraints are prominent as non-convex, non-differentiable, and challenging when defined in optimization-based motion planning problems. To overcome these issues, this paper presents a novel non-conservative collision avoidance technique using the notion of convex optimization to establish the distance between robotic spacecraft and space structures for autonomous on-orbit assembly operations. The proposed technique defines each ellipsoidal- and polyhedral-shaped object as the union of convex compact sets, each represented non-conservatively by a real-valued convex function. Then, the functions are introduced as a set of constraints to a convex optimization problem to produce a new set of differentiable constraints resulting from the optimality conditions. These new constraints are later fed into an optimal control problem to enforce collision avoidance where the motion planning for the autonomous on-orbit assembly takes place. Numerical experiments for two assembly scenarios in tight environments are presented to demonstrate the capability and effectiveness of the proposed technique. The results show that this framework leads to optimal non-conservative trajectories for robotic spacecraft in tight environments. Although developed for autonomous on-orbit assembly, this technique could be used for any generic motion planning problem where collision avoidance is crucial.